Factors shaping the abundance and diversity of the gut archaeome across the animal kingdom

Archaea are common constituents of the gut microbiome of humans, ruminants, and termites but little is known about their diversity and abundance in other animals. Here, we analyse sequencing and quantification data of archaeal and bacterial 16S rRNA genes from 250 species of animals covering a large taxonomic spectrum. We detect the presence of archaea in 175 animal species belonging to invertebrates, fish, amphibians, birds, reptiles and mammals. We identify five dominant gut lineages, corresponding to Methanobrevibacter, Methanosphaera, Methanocorpusculum, Methanimicrococcus and “Ca. Methanomethylophilaceae”. Some archaeal clades, notably within Methanobrevibacter, are associated with certain hosts, suggesting specific adaptations. The non-methanogenic lineage Nitrososphaeraceae (Thaumarchaeota) is frequently present in animal samples, although at low abundance, but may have also adapted to the gut environment. Host phylogeny, diet type, fibre content, and intestinal tract physiology are major drivers of the diversity and abundance of the archaeome in mammals. The overall abundance of archaea is more influenced by these factors than that of bacteria. Methanogens reducing methyl-compounds with H2 can represent an important fraction of the overall methanogens in many animals. Together with CO2-reducing methanogens, they are influenced by diet and composition of gut bacteria. Our results provide key elements toward our understanding of the ecology of archaea in the gut, an emerging and important field of investigation.

of ASVs from this archaeal order cluster with Methanocorpusculum reference sequences while none of the ASVs cluster with Methanomicrobium (Supplementary Figure 4). ASVs from Methanocorpusculum gather into two main clades. One clade has ASVs from a variety of mammalian orders (i.e., Rodentia, Diprotodontia, Carnivora, Pilosa, and Primates) as well as ASVs derived from amphibians, invertebrates, reptiles (Squamata), and birds. In this clade, three ASVs from Rodentia are well separated from the others and are only present in members of this animal order. The second large clade subdivides into several subclades, each of which is composed of sequences with a strong specificity for Testudines, Perissodactyla, or Cetartiodactyla (Supplementary Figure 4). The branching of ASVs from Testudines among ASVs from Perissodactyla (and the associated paraphyly of the ASVs from Perissodactyla) could be due to phylogenetic biases.
Methanosarcinales are sparsely distributed throughout animal species, with a high relative abundance in some reptiles and insect-eating mammals. In terms of number of reads, a single genus, Methanimicrococcus, was the fourth most abundant archaeal lineage (Figure 2a). This order of methanogens had the lowest level of diversity, with just 81 ASVs. These are split between two genera, Methanimicrococcus (55 ASVs) and Methanosarcina (26 ASVs) (Supplementary Figure 18). However, 92.7% of the Methanosarcinales reads were attributed to Methanimicrococcus, while Methanosarcina only accounted for 7.3%. The genus Methanimicrococcus has largely been associated with the digestive tract of termites and cockroaches, but was recently shown to form two distinct clades, one of sequences obtained from insects and another from mammals Sup12. One ASV that clustered with insect-derived Methanimicrococcus reference sequences came from a frog and a toad, possibly indicating that these amphibians acquired this methanogen from their prey. Similarly, we found that several ASVs from other animals that feed on invertebrates, such as the long-spine squirrel fish, common gull, Eurasian coot, and great spotted woodpecker, all cluster within the same clade as the insect-enriched reference sequences of Methanimicrococcus. The largest number of Methanimicrococcus reads came from animals whose primary diet is invertebrates (Figure 3d). Most mammal-derived Methanimicrococcus ASVs fall within the previously proposed mammal clade Sup12 (Supplementary Figure 18).

Cooccurrence analysis
It was not possible to run a cooccurrence analysis on all animals because of the sampling distribution and the number of ASVs (clustered into OTUs) shared between groups. However, it was possible to analyze several animal lineages for which enough samples were collected and there was enough overlap between ASVs. This analysis was thus run on all mammal samples and independently on mammalian orders, i.e., Primates, Perissodactyla/Cetartiodactyla and Carnivora. Further, because few studies have targeted the ecology of intestinal archaea in birds and reptiles and we had enough samples for robust statistical analyses, we also ran a cooccurrence analysis on each of these two classes of animals.
Mammals. Across mammals we identified 620 OTUs that were present in >10% of all mammal species. Six archaea-bacteria relationships were identified by both co-occurrence algorithms, four of which were between methanogens and Clostridiales (Supplementary Data 2). In addition to its cooccurrence with an OTU closely related to Lachnospira pectinoschiza, Ca. Methanomethylophilaceae OTUarc_11 was also found to be significantly correlated to the presence of a bacterial OTU belonging to the family Muribaculaceae (phylum Bacteroidetes), which was particularly abundant in the Cingulata. A BLAST search with this bacterial OTU sequence did not return any significant hits. This family of bacteria was previously identified as a dominant member of the rodent intestinal microbiome with the capacity to degrade pectin Sup13,14, likely forming methanol. However, the lack of a clear annotation of this bacterial OTU leaves the details of the relationship between these archaea and bacteria unclear. The Ca. Nitrosocosmicus OTUarc_45 (including ASV4), the most widespread Thaumarchaeota in our dataset, was significantly linked to Solobacterium, which was isolated from human feces Sup15. The cooccurrence of these two OTUs is also observed when only Primates samples are included in the analysis.
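The prevalence criterion used above (OTUs present in >10% of mammal species) can be illustrated with a minimal Python sketch. The input layout (a dictionary of per-species OTU read counts), the function name, and the rule that any non-zero read count counts as presence are assumptions made for illustration; they are not taken from the analysis pipeline itself.

```python
from collections import defaultdict


def prevalent_otus(counts_by_species, min_fraction=0.10):
    """Return OTUs detected in more than `min_fraction` of host species.

    counts_by_species: dict mapping a host species name to a dict of
    OTU id -> read count (hypothetical layout, for illustration only).
    """
    n_species = len(counts_by_species)
    species_per_otu = defaultdict(int)
    for otu_counts in counts_by_species.values():
        for otu, count in otu_counts.items():
            # Assumption: an OTU is "present" if it has at least one read.
            if count > 0:
                species_per_otu[otu] += 1
    # Strict inequality, matching the ">10%" wording in the text.
    return sorted(
        otu for otu, n in species_per_otu.items()
        if n / n_species > min_fraction
    )
```

With ten host species, an OTU found in two of them (20%) is retained, whereas an OTU found in only one (exactly 10%) is excluded by the strict inequality.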
Ungulata: Perissodactyla and Cetartiodactyla. The highest number of archaeal-bacterial relationships was observed in the Ungulata (Perissodactyla and Cetartiodactyla; Supplementary Data 2) with 17 positive associations and seven negative associations. These orders also have the highest concentration of archaea on average (3.3 × 10^8 16S rRNA gene copies/gram of feces), as well as some of the highest diversity of archaea. Nine positive relationships we identified were between Archaea and members of the Clostridiales, four were between Archaea and members of the Bacteroidetes, and the other four were between Archaea and either Melainabacteria, Patescibacteria, or Tenericutes (Supplementary Data 2).
Two Christensenellaceae OTUs (OTUbac_67 and OTUbac_2376) were negatively correlated to methanogen OTUs: Methanobrevibacter OTUarc_20 (corresponding to M. ruminantium) and Ca. Methanomethylophilaceae OTUarc_189 (Supplementary Data 2). This is surprising, as Christensenellaceae have previously been shown to be positively correlated with Methanobrevibacter smithii in the human intestine and some of their representatives support the growth of this methanogen Sup16. However, a third Christensenellaceae OTU (OTUbac_503) was positively correlated to Methanobrevibacter OTUarc_33 (corresponding to Methanobrevibacter wolinii; Supplementary Data 2). Methanobrevibacter OTUarc_20 and Ca. Methanomethylophilaceae OTUarc_189 were also negatively correlated with a Ruminococcaceae OTU and a Patescibacteria OTU, respectively.
Primates. All the archaea-bacteria associations we identified in primates were between archaea and Firmicutes. The positive association between the pectin-degrading bacterium Lachnospira (OTUbac_2345) and Ca. Methanomethylophilaceae OTUarc_11, found when considering all mammals, was also identified when the microbial community of primates was analyzed independently. Although the edge stability was slightly below the cutoff threshold of 0.5, all the other metrics indicate the significance of this relationship (edge stability = 0.45, p = 0.05, rho = 0.42). Two CO2-reducing hydrogenotrophic methanogens, Methanobrevibacter and Methanocorpusculum, were also found to be positively correlated to uncharacterized members of the Ruminococcaceae, one of the most abundant families of bacteria in the gut of mammals Sup17.
Carnivora. Members of the Carnivora host a unique community of intestinal archaea among mammals (a low concentration of archaea and, in a number of them, an archaeome dominated by Thaumarchaeota); thus we also performed an individual cooccurrence analysis on this group of mammals. OTUs belonging to the Methanobacteriales and Nitrososphaerales are significantly correlated to several members of the Firmicutes and one Proteobacteria in the Carnivora (Supplementary Data 2). Methanobacterium (OTUarc_325; 95% similarity to Methanobacterium formicicum) and Methanobrevibacter (OTUarc_20; 98% similarity to M. olleyae/ruminantium) were positively associated with OTUs from the Firmicutes belonging to the Lachnospiraceae (Blautia and Tyzzerella_4, respectively). Further, we found that Methanobrevibacter was positively correlated to the presence of a Lactobacillus OTU (Supplementary Data 2). A Ca. Nitrosocosmicus OTU (Thaumarchaeota) was positively linked to a Staphylococcus OTU (Firmicutes), but negatively linked to an Enterobacteriaceae OTU corresponding to Escherichia/Shigella (Proteobacteria). The cooccurrence between a dominant intestinal bacterium and a newly identified genus of intestinal archaea warrants further investigation.
Aves and Reptiles. In Aves, no archaea-bacteria relationships were shared between the SparCC and SPIEC-EASI approaches (Supplementary Data 2). However, the SparCC approach identified one significant archaea-bacteria relationship, between a Methanosphaera OTU and a Clostridiales OTU (Clostridium sensu stricto 1), and the SPIEC-EASI approach identified 57 relationships between archaea and bacteria. Interestingly, the SPIEC-EASI approach identified that M. smithii (OTUarc_1) is negatively correlated to several bacterial OTUs, including common gut bacterial groups such as Clostridia and Arthrobacter, the latter previously characterized as being part of the goose core gut microbiome Sup18. It is currently not clear why this apparently well-adapted intestinal archaeon would be negatively related to common intestinal bacteria in birds. Other archaeal OTUs such as Methanosphaera, Ca. Methanomethylophilaceae, and Nitrososphaeraceae were positively correlated to various bacterial groups (Supplementary Data 2). For example, one of the Nitrososphaeraceae OTUs was positively correlated with a Blautia OTU. The same bacterial OTU was also negatively correlated to OTUarc_1.
Two archaeal OTUs were linked to bacteria in reptiles using both approaches. A Methanocorpusculum OTU is significantly positively correlated to a Providencia OTU (100% identical to Providencia rettgeri; Enterobacteriales), which is a common constituent of the human and reptile intestinal tract Sup19. Also, Methanomassiliicoccus is positively correlated to the Clostridiaceae_1 family (95% identical to Clostridium cylindrosporum).
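Retaining only relationships recovered by both inference methods (here SparCC and SPIEC-EASI) amounts to intersecting two edge lists. The following Python sketch shows one way this could be done. The edge representation (OTU pair plus sign), the function names, and the requirement that both methods agree on the sign of the association are assumptions for illustration; the text does not state how agreement was defined.

```python
def _normalise(edges):
    """Map each unordered OTU pair to the sign of its association.

    edges: iterable of (otu_1, otu_2, sign) tuples, with sign +1 or -1
    (hypothetical layout, for illustration only).
    """
    return {frozenset((a, b)): sign for a, b, sign in edges}


def consensus_edges(edges_method_a, edges_method_b):
    """Return edges reported by both methods with the same sign."""
    a = _normalise(edges_method_a)
    b = _normalise(edges_method_b)
    shared = []
    for pair in a.keys() & b.keys():
        # Assumption: an edge counts as shared only if the sign agrees.
        if a[pair] == b[pair]:
            shared.append((tuple(sorted(pair)), a[pair]))
    return sorted(shared)
```

Because pairs are stored as unordered sets, an edge reported as (archaeon, bacterium) by one method and (bacterium, archaeon) by the other is still recognised as the same relationship. An empty result corresponds to the situation described for Aves, where no relationship was shared between the two approaches.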
Cooccurrence between archaeal OTUs. Globally, these analyses also highlighted an absence of negative archaea-archaea relationships and several positive archaea-archaea relationships between OTUs of the same family. This was notably the case between Methanobrevibacter and Methanosphaera OTUs (Methanobacteriaceae) in Primates, Ungulata and mammals overall, but no associations were observed between Methanobrevibacter OTUs or between Methanosphaera OTUs. However, a Ca. Methanomethylophilaceae OTU was also associated with another Ca. Methanomethylophilaceae OTU in Primates (Supplementary Data 2). Nitrososphaeraceae OTUs were positively associated with each other in mammals.

Table S1: Percentage of ASVs (n = 1307) and total reads corresponding to characterized archaea (isolates/enriched). ASV sequences were compared to 16S rRNA genes of type strain archaea in the SILVA Living Tree Project (LTP) plus additional sequences of enriched archaea whose genome was sequenced (custom database) and with the SILVA Living Tree Project (LTP) alone.

Figure S5
[Tree figure; recoverable labels: Carnivora, Anura. Tree scale: 0.1]